[OpenCL] Change of OpenCL profiling logic #11180
In general LGTM. Thanks
@masahi could you please take a look at this PR?
* Enable profiling only when it is used explicitly
* Change logic of clCommandQueue create/destroy
* Update comments
* Linter fix
* Refactor queue create
* Move queue recreation logic to function
* Replace profiling flag by the queue info request
* Enhance readability
* Fix linter errors
This PR causes a CLML profiling failure; the reason is explained below. In general, workspaces can be accessed and shared via “device_api.opencl”. The CLML integration shares the workspace created by the default OpenCL runtime and holds a reference to its command queue. Replacing the command queue in between invalidates that reference. Why do we enable the profiler at compile time when we don't want to profile anything? In any case, dynamically recreating the queue will cause issues for other components.
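The concern above can be illustrated with a minimal sketch. It does not use OpenCL or TVM code: `SharedWorkspace`, `CachingConsumer`, and the integer "generation" standing in for a `cl_command_queue` handle are all hypothetical names chosen for illustration. It only shows why a component that caches the queue handle may end up with a stale one once the owner recreates the queue.

```cpp
#include <cassert>

// Hypothetical stand-in for the shared OpenCL workspace. The integer plays the
// role of the command-queue handle; it changes whenever the queue is recreated.
struct SharedWorkspace {
  int queue_generation = 0;
  void RecreateQueue() { ++queue_generation; }
};

// Hypothetical consumer (for example, another runtime sharing the workspace)
// that caches the queue handle once instead of re-reading it on every use.
class CachingConsumer {
 public:
  explicit CachingConsumer(const SharedWorkspace& ws)
      : ws_(ws), cached_generation_(ws.queue_generation) {}

  // The cached handle is only usable while the owner has not replaced the queue.
  bool HandleIsValid() const { return cached_generation_ == ws_.queue_generation; }

  // Re-read the handle from the workspace after a recreation.
  void Refresh() { cached_generation_ = ws_.queue_generation; }

  // Any use of the queue requires a valid handle.
  void Enqueue() const { assert(HandleIsValid() && "stale queue handle"); }

 private:
  const SharedWorkspace& ws_;
  int cached_generation_;
};
```

In this sketch the consumer would have to call `Refresh()` after every recreation, or look the queue up through the workspace on each use, to stay valid.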
Profiling in TVM is enabled or disabled at compile time by the `USE_PROFILER` switch. This means that if we enable profiling in `config.cmake` but do not use any profiling features in the app, OpenCL is still forced to collect `cl_event` objects.

To reproduce: build TVM with `set(USE_PROFILER ON)`, consider a simple app that creates a module from the `.so` file, and then collect memory usage info with Valgrind.

We do not use any profiling info, but it is collected implicitly because of the compile-time switches:
tvm/src/runtime/opencl/opencl_device_api.cc, line 431 in 6babb89
tvm/src/runtime/opencl/opencl_module.cc, line 84 in 6babb89
With the proposed modifications this behavior changes: by default, `clCommandQueue` is created in normal mode and is recreated with profiling capabilities only when the user calls the profiler explicitly. When a profiling session finishes, the queue is recreated in normal mode again, which makes it possible to mix `profile()` calls and `run()` calls.

With the proposed changes, Valgrind shows no abnormal memory usage for the example above.
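The queue lifecycle described above can be sketched roughly as follows. This is a simplified model, not the code from this PR: `MockQueue`, `MockWorkspace`, `EnableQueueProfiling`, `Run`, and `Profile` are hypothetical names, and the mock queue only records whether it was created with profiling enabled. A real implementation would create and release actual OpenCL command queues and would need to finish pending work before replacing a queue.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a cl_command_queue; not a real OpenCL type.
struct MockQueue {
  bool profiling_enabled;
};

// Hypothetical stand-in for the workspace that owns the queue.
class MockWorkspace {
 public:
  // By default the queue is created in normal (non-profiling) mode.
  MockWorkspace() : queue_(new MockQueue{false}) {}

  // Ask the queue itself for its mode instead of keeping a separate flag.
  bool IsProfiling() const {
    assert(queue_ != nullptr);
    return queue_->profiling_enabled;
  }

  // Recreate the queue only when the requested mode differs from the current one.
  void EnableQueueProfiling(bool enable) {
    if (IsProfiling() == enable) return;
    queue_.reset(new MockQueue{enable});
    ++recreate_count_;
  }

  int recreate_count() const { return recreate_count_; }

 private:
  std::unique_ptr<MockQueue> queue_;
  int recreate_count_ = 0;
};

// A plain run never touches the queue mode; returns whether events would be collected.
bool Run(MockWorkspace& ws) { return ws.IsProfiling(); }

// A profiling session switches the queue to profiling mode and restores normal mode afterwards.
bool Profile(MockWorkspace& ws) {
  ws.EnableQueueProfiling(true);
  bool events_collected = ws.IsProfiling();
  ws.EnableQueueProfiling(false);
  return events_collected;
}
```

In this model, alternating `Run` and `Profile` calls leaves the queue in normal mode between profiling sessions, at the cost of two queue recreations per session.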